Development of a Korean OCR System: Term Project in CSE 581 - Pattern Recognition

Author

  • Nils Krahnstöver
Abstract

This is the final report for the term project in CSE 581 (Pattern Recognition). The goal of this project is to develop a character recognition system that can recognize and classify a subset of the Korean language. Training and test samples were obtained from Korean books, and additional sets were created by applying a degradation model to the obtained samples. Using geometrical, statistical and global features, the system achieves a recognition rate of 90%. Classification is based on the nearest neighbor rule.
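The nearest neighbor rule named in the abstract can be illustrated with a minimal sketch. The abstract does not give the report's actual feature set, distance metric, or data, so everything below is an assumption made for illustration: the function name `nearest_neighbor`, the Euclidean distance, and the three-dimensional feature vectors are all hypothetical.

```python
import math


def nearest_neighbor(train, query):
    """Return the label of the training sample closest to `query`.

    `train` is a list of (feature_vector, label) pairs. Euclidean distance
    is an assumption; the report's metric is not stated in the abstract.
    """
    best_label, best_dist = None, math.inf
    for features, label in train:
        dist = math.dist(features, query)  # Euclidean distance (Python 3.8+)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Hypothetical feature vectors, one per labeled character sample.
train = [
    ((0.20, 0.70, 0.10), "가"),
    ((0.80, 0.30, 0.50), "나"),
    ((0.40, 0.40, 0.90), "다"),
]

# Classify an unseen sample by its closest training sample.
print(nearest_neighbor(train, (0.75, 0.35, 0.45)))
```

An unseen sample takes the label of whichever stored sample lies closest in feature space, so the recognition rate depends on how well the chosen features separate the character classes.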


Similar resources

Script Identification – A Han & Roman Script Perspective

All Han-based scripts (Chinese, Japanese, and Korean) possess similar visual characteristics. Hence, developing a system that identifies Chinese, Japanese and Korean scripts on a single document page is quite challenging. It is noted that a Han-based document page might also have Roman script in it. A multi-script OCR system dealing with Chinese, Japanese, Korean, and Roman scripts, dem...


A sustainable development OCR system in CADAL application

This paper briefly introduces the main ideas of a sustainable development OCR system based on open architecture techniques and then describes the construction of an optical character recognition (OCR) center built on computer clusters, for the purpose of dynamically improving the recognition precision of the digitized texts of a million volumes of books produced by the China-US Million Books Di...


Optical Character Recognition for Handwritten Cursive English characters

Optical Character Recognition (OCR) is the technique which enables a machine to automatically recognize the characters or scripts written in the users’ language. Optical Character Recognition (OCR) has become one of the most successful applications of technology in the field of pattern recognition and artificial intelligence. In this project a scanned image is translated into machine editable t...


A Literature Survey on Digital Image Processing Techniques in Character Recognition of Indian Languages

Handwritten character recognition has always been a frontier area of research in the field of pattern recognition. There is a large demand for OCR on handwritten documents in image processing. Even though sufficient studies have been performed on foreign scripts like Arabic, Chinese and Japanese, only very few works can be traced for handwritten character recognition, mainly for the south Indian scripts...


A Real-time DSP-Based Optical Character Recognition System for Isolated Arabic characters using the TI TMS320C6416T

Optical Character Recognition (OCR) is an area of research that has attracted the interest of researchers for the past forty years. Although the subject has been a central topic for many researchers for years, it remains one of the most challenging and exciting areas in pattern recognition. Since Arabic is one of the most widely used languages in the world, the demand for a robust OCR for this...


Publication date: 1999